Protein Family Classi cation using Sparse Markov Transducers
نویسندگان
چکیده
In this paper we present a method for classifying proteins into families using sparse Markov transducers (SMTs). Sparse Markov transducers, similar to probabilistic suÆx trees, estimate a probability distribution conditioned on an input sequence. SMTs generalize probabilistic suÆx trees by allowing for wild-cards in the conditioning sequences. Because substitutions of amino acids are common in protein families, incorporating wildcards into the model signi cantly improves classi cation performance. We present two models for building protein family classi ers using SMTs. We also present eÆcient data structures to improve the memory usage of the models. We evaluate SMTs by building protein family classi ers using the Pfam database and compare our results to previously published results.
منابع مشابه
Protein Family Classification Using Sparse Markov Transducers
We present a method for classifying proteins into families based on short subsequences of amino acids using a new probabilistic model called sparse Markov transducers (SMT). We classify a protein by estimating probability distributions over subsequences of amino acids from the protein. Sparse Markov transducers, similar to probabilistic suffix trees, estimate a probability distribution conditio...
متن کاملImage Classification Based on a Multiresolution Two Dimensional Hidden Markov Model
This paper presents an image classi cation algorithm using a multiresolution two dimensional hidden Markov model (HMM). The multiresolution two dimensional hidden Markov model is an extension from the two dimensional hidden Markov model for image classi cation. A classi er estimates model parameters using the EM algorithm. Classi cation is then performed according to the maximum a posteriori pr...
متن کاملPIRSF: family classi®cation system at the Protein Information Resource
The Protein Information Resource (PIR) is an integrated public resource of protein informatics. To facilitate the sensible propagation and standardization of protein annotation and the systematic detection of annotation errors, PIR has extended its superfamily concept and developed the SuperFamily (PIRSF) classi®cation system. Based on the evolutionary relationships of whole proteins, this clas...
متن کاملHidden Markov Models for Silhouette Classi cation
In this paper, a new technique for object classi cation from silhouettes is presented. Hidden Markov Models are used as a classi cation mechanism. Through a set of experiments, we show the validity of our approach and show its invariance under severe rotation conditions. Also, a comparison with other techniques that use Hidden Markov Models for object classi cation from silhouettes is presented.
متن کاملJoint Video Scene Segmentation and Classification based on Hidden Markov Model
Video classi cation and segmentation are fundamental steps for e cient accessing, retrieving and browsing large amount of video data. We have developed a scene classi cation scheme using a Hidden MarkovModel (HMM)based classi er. By utilizing the temporal behaviors of di erent scene classes, HMM classi er can e ectively classify video segments into one of the prede ned scene classes. In this pa...
متن کامل